Collection download: Fetch manifests from Github#611
Conversation
| global _content_manifests_by_grade_name | ||
| global _collection_download_manager | ||
|
|
||
| _collection_download_manager = CollectionDownloadManager() |
There was a problem hiding this comment.
I'm not sure why the download manager initialization needs to be delayed. The only time the manifests come into play is when other code calls _collection_download_manager.from_manifest() or _collection_download_manager.from_state().
There was a problem hiding this comment.
Well, today the manifests are read and parsed at import time. For example when kolibri manage migrate is called and this plugin is enabled. Not ideal. Worse would it be to add an online fetch when running migrations.
| "https://raw.githubusercontent.com/endlessm" | ||
| "/endless-key-collections/main/json" | ||
| "/{grade}-{name}.json" | ||
| ) |
There was a problem hiding this comment.
This seems pretty dangerous to make this use the main HEAD commit at runtime. If someone commits a mistake to main or is in the process of refactoring the collections then all apps are broken immediately. I think if we want to move in this direction then we should make releases of endless-key-collections and use released artifacts.
There was a problem hiding this comment.
Yeah. @pwithnall started creating releases already. There is a ZIP file with all the source. I wonder if the JSON files could be separate release artifacts so they can be requested directly. Otherwise this code should request the ZIP and unpack it.
There was a problem hiding this comment.
I wonder if the JSON files could be separate release artifacts so they can be requested directly.
There’s an unbounded number of JSON files, so I’m not sure that would actually gain you anything. Seems more reliable to be able to make a single request (for the .tar.gz) and then extract all the JSON files from the well-known json/ directory inside.
There was a problem hiding this comment.
Okey. The other improvement I'd do is allow changing the URL in an options.ini plugin setting. That way the same app build will be able to test different collections. It can be useful for content curation and testing (passing a mock collection with fewer content to avoid long download times).
There was a problem hiding this comment.
Another option which doesn't require releases and artifacts but is a little safer is to have a stable branch in endless-key-collections that we occasionally fast forward to a known good point on master. Then you can continue using the direct github links with more confidence that it won't randomly break.
Another thing to consider is using github pages since I expect that would scale a lot better than the raw links. We could arrange that the pages branch is only updated on release.
There was a problem hiding this comment.
The collection JSON is now published to https://endlessm.github.io/endless-key-collections per endlessm/endless-key-collections#29, so I think you should change the base URL in this PR. When new tags are made in endless-key-collections, the published JSON will automatically be updated.
This will be used in next commit. Having the URL configurable will be useful for testing.
c16fc0f to
755ec2b
Compare
|
Ready for review again. I have updated the URL according to #611 (comment) and made it configurable. I think this will be useful to test unreleased collections and alternate collections (quickly iterate through the process with minimum content). |
This way the software update is decoupled from content update. Issue #610
Previously the manifest files were read from disk so it didn't matter to read them all on module import. But now they are fetched from the Internet. So add a decorator to all API views for initiating the global variables and fetch collections only once on first API call.
This is not needed anymore. Remove the setting and the method to parse local manifests.
755ec2b to
2bf1c51
Compare
This way the software update is decoupled from content update.
Issue #610